Abstract. We study the state complexity of boolean operations and product
(concatenation, catenation) combined with star. We derive tight upper bounds
for the symmetric differences and differences of two languages, one or both of
which are starred, and for the product of two starred languages. We prove that
the previously discovered bounds for the union and the intersection of
languages with one or two starred arguments, for the product of two languages
one of which is starred, and for the star of the product of two languages can
all be met by the recently introduced universal witnesses and their variants.
1 Introduction

The state complexity of a regular language is the number of states in the
minimal deterministic finite automaton (DFA) recognizing the language. The
state complexity of an operation on regular languages is the worst-case state
complexity of the result of the operation as a function of the state
complexities of the arguments. For more information on this topic see [112111].

Let $K$ and $L$ be two regular languages over alphabet $\Sigma$, and let their
state complexities be $m$ and $n$, respectively. In 2007 A. Salomaa,
K. Salomaa, and Yu [TU] showed using ternary witnesses that the complexity of
$(K \cup L)^*$ is $2^{m+n-1} - (2^{m-1} + 2^{n-1} - 1)$. They also established
a lower bound for $(K \cap L)^*$ using an alphabet of 8 letters. These results
were improved by Jirásková and Okhotin [9] who showed that binary witnesses
suffice for $(K \cup L)^*$, and that $3 \cdot 2^{mn-2}$ is a tight upper bound
for $(K \cap L)^*$; they used an alphabet of 6 letters. In 2012, Gao and Yu [8]
showed with ternary witnesses that the complexity of $K \cup L^*$ is
$m(2^{n-1} + 2^{n-2} - 1) + 1$, and that the same upper bound applies to
$K \cap L^*$. Moreover, it was shown in [6] by Gao, Kari and Yu that quaternary
witnesses meet the bound $(2^{m-1} + 2^{m-2} - 1)(2^{n-1} + 2^{n-2} - 1) + 1$
for $K^* \cup L^*$ and $K^* \cap L^*$. In 2008, Gao, K. Salomaa, and Yu [7]
demonstrated using quaternary witnesses that
$2^{m+n-1} + 2^{m+n-4} - (2^{m-1} + 2^{n-1} - m - 1)$ is a tight upper bound
for $(KL)^*$. The complexity of $KL^*$ was studied by Cui, Gao, Kari and Yu [5]
in 2012. They proved with ternary witnesses that the tight bound is
$m(2^{n-1} + 2^{n-2}) - 2^{n-2}$. The same authors also showed in [4] using
quaternary witnesses that the complexity of $K^*L$ is
$5 \cdot 2^{m+n-3} - 2^{m-1} - 2^n + 1$. In summary, nine operations using
union, intersection, and product (also called concatenation or catenation)
combined with star have been studied.

To establish the state complexity of an operation one finds an upper bound
and languages to act as witnesses to show that the bound is tight. A witness
is usually a sequence $(L_n \mid n \ge k)$ of languages, where $k$ is some
small positive integer; we will call such a sequence a stream of languages. The
languages in a stream normally differ only in the parameter $n$. In the past,
two different streams have been used for most binary operations.

Recently, Brzozowski [2] proposed the DFA
$\mathcal{U}_n(a,b,c) = (Q, \Sigma, \delta, 0, \{n-1\})$ of Fig. 1 and its
language $U_n(a,b,c)$ as the "universal witness" DFA and language,
respectively, for $n \ge 3$. The restrictions of the DFA and the language to
alphabet $\{a,b\}$ are denoted by $\mathcal{U}_n(a,b,\emptyset)$ and
$U_n(a,b,\emptyset)$. It was proved in [2] that the bound $2^{n-1} + 2^{n-2}$
for star is met by $U_n(a,b,\emptyset)$, and the bound $2^n$ for reversal, by
$U_n(a,b,c)$. The bound $(m-1)2^n + 2^{n-1}$ for product is met by
$U_m(a,b,c)$ and $U_n(a,b,c)$. The bound $mn$ for union, intersection,
difference ($K \setminus L$) and symmetric difference ($K \oplus L$) is met by
the streams $U_m(a,b,c)$ and $U_n(a,b,c)$ if $m \ne n$, as was conjectured in
[2] and proved in [3]. If $m = n$, it is necessary to use two different
streams; however, it is possible to use streams that are almost the same, in
the following sense. Two languages $K$ and $L$ over $\Sigma$ are
permutationally equivalent if one can be obtained from the other by permuting
the letters of the alphabet, and a similar definition applies to DFA's. It was
proved in [2] that two permutationally equivalent streams $U_m(a,b,c)$ and
$U_n(b,a,c)$ are witnesses to the bound for the boolean operations: union
($K \cup L$), intersection ($K \cap L$), difference ($K \setminus L$), and
symmetric difference ($K \oplus L$). Thus $U_n(a,b,c)$ is indeed a universal
witness for the basic operations.




Fig. 1. DFA $\mathcal{U}_n(a,b,c)$ of language $U_n(a,b,c)$.

It turns out that the witness $U_n(a,b,c)$ cannot meet the bound for some
combined operations. However, the notion of universal witness can be broadened
to include "dialects" of $U_n(a,b,c)$. Some terminology is required before we
define this concept.
The inputs of DFA $\mathcal{U}_n$ perform the following transformations on the
set $Q = \{0, \dots, n-1\}$ of states. Input $a$ is a cycle of all $n$ states,
and this is denoted by $a: (0, \dots, n-1)$. Input $b$ is a transposition of 0
and 1, and does not affect any other states; this is denoted by $b: (0,1)$, and
by $b: (i,j)$, if $i$ and $j$ are transposed. Input $c$ is a singular
transformation sending state $n-1$ to state 0, and not affecting any other
states; this is denoted by $c: \binom{n-1}{0}$, and by $c: \binom{i}{j}$, in
general. The constant transformation sending all states to state $i$ is denoted
by $\binom{Q}{i}$. The identity transformation on $Q$ is denoted by $1_Q$.

It is known [5] that the inputs of $\mathcal{U}_n(a,b,c)$ of Fig. 1 perform all
$n^n$ transformations of states.
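For small $n$ this statement can be checked by brute force. The following Python sketch is an illustration only and is not taken from the paper: it assumes that a transformation is encoded as a tuple `t` with `t[q]` the image of state `q`, and the helper names `universal_witness_inputs` and `generated_monoid` are hypothetical.

```python
def universal_witness_inputs(n):
    """Transformations induced by a, b, c in U_n(a, b, c); t[q] is the image of q."""
    a = tuple((q + 1) % n for q in range(n))   # cycle (0, ..., n-1)
    b = (1, 0) + tuple(range(2, n))            # transposition (0, 1)
    c = tuple(range(n - 1)) + (0,)             # n-1 -> 0, all other states fixed
    return [a, b, c]


def generated_monoid(generators):
    """All transformations obtainable as products of the generators (with identity)."""
    n = len(generators[0])
    identity = tuple(range(n))
    seen = {identity}
    frontier = [identity]
    while frontier:
        t = frontier.pop()
        for g in generators:
            u = tuple(g[t[q]] for q in range(n))  # apply t first, then g
            if u not in seen:
                seen.add(u)
                frontier.append(u)
    return seen


for n in (3, 4):
    size = len(generated_monoid(universal_witness_inputs(n)))
    print(n, size, size == n ** n)
```

The search closes the identity under right multiplication by the three inputs, so the size of the resulting set is the number of distinct transformations induced by words over $\{a,b,c\}$.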

A dialect of $U_n(a,b,c)$ is the language of any DFA with three inputs $a$,
$b$, and $c$, where $a$ is a cycle of length $n$ as above, $b$ is the
transposition of any two states and $c$ is a singular transformation
$c: \binom{i}{j}$ sending any state $i$ to any state $j$. The initial state is
always 0, but the set of final states is arbitrary, as long as the resulting
DFA is minimal.

Since there are operations for which ternary witnesses do not meet the
worst-case bounds, the notions of universal witness and dialect have been
extended to quaternary alphabets [5], by adding a fourth input $d$ which
performs the identity permutation, denoted by $d: 1_Q$. The concepts of
permutational equivalence and dialects were extended in the obvious way to
quaternary languages and DFA's. The following dialects are used in this paper:

1. $\mathcal{U}_{\{0\},n}(a,b,c)$, which is $\mathcal{U}_n(a,b,c)$ with $\{0\}$
   as the set of final states.

2. $\mathcal{T}_n(a,b,c) = (Q, \Sigma, \delta_T, 0, \{n-1\})$, where
   $a: (0, \dots, n-1)$, $b: (0,1)$, and $c: \binom{1}{0}$.

3. $\mathcal{W}_n(a,b,c,d) = (Q, \Sigma, \delta_W, 0, \{n-1\})$, where
   $a: (0, \dots, n-1)$, $b: (n-2, n-1)$, $c: \binom{1}{0}$, and $d: 1_Q$.

4. $\mathcal{W}_{\{0\},n}(a,b,c,d)$, which is $\mathcal{W}_n(a,b,c,d)$ with
   $\{0\}$ as the set of final states.

We use the convention that $\mathcal{X}$ is a DFA if and only if $X$ is its
language. The operation $K \circ L$ represents any one of the four boolean
operations union, intersection, difference and symmetric difference.

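In this paper, we consider the following 13 operations that use boolean
operations and product combined with star:

$K \cup L^*$, $K \cap L^*$, $K \oplus L^*$, $K \setminus L^*$, $L^* \setminus K$,

$K^* \cup L^*$, $K^* \cap L^*$, $K^* \oplus L^*$, $K^* \setminus L^*$,

$KL^*$, $K^*L$, $K^*L^*$, $(KL)^*$.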

Our contributions are as follows: 

1. We derive the bound $m(2^{n-1} + 2^{n-2} - 1) + 1$ for
   $K_m \setminus L_n^*$, $L_n^* \setminus K_m$ and $K_m \oplus L_n^*$. We show
   that the known bounds for $K_m \cup L_n^*$, $K_m \oplus L_n^*$ and
   $L_n^* \setminus K_m$ are met by the streams $U_m(a,b,c)$ and $U_n(b,a,c)$,
   and that, for $K_m \cap L_n^*$ and $K_m \setminus L_n^*$, the dialect
   $U_{\{0\},m}(a,b,c)$ and the language $U_n(b,a,c)$ act as witnesses. This
   corrects an error in [8], where it is claimed that the witnesses that serve
   for union also work for intersection.

2. We derive the bound $(2^{m-1} + 2^{m-2} - 1)(2^{n-1} + 2^{n-2} - 1) + 1$ for
   $K_m^* \setminus L_n^*$ and $K_m^* \oplus L_n^*$. We show that the known
   bounds for $K_m^* \cup L_n^*$ and $K_m^* \cap L_n^*$ are met by the dialects
   $W_m(a,b,c,d)$ and $W_n(d,c,b,a)$, and that, for $K_m^* \setminus L_n^*$ and
   $K_m^* \oplus L_n^*$, the dialects $W_{\{0\},m}(a,b,c,d)$ and $W_n(d,c,b,a)$
   act as witnesses.

3. We prove that the known bound $m(2^{n-1} + 2^{n-2}) - 2^{n-2}$ for
   $K_m L_n^*$ is met by the dialects $T_m(a,b,c)$ and $T_n(b,a,c)$.

4. We show that the known bound $5 \cdot 2^{m+n-3} - 2^{m-1} - 2^n + 1$ for
   $K_m^* L_n$ is met by $U_m(a,b,c,d)$ and $U_n(d,c,b,a)$.

5. We derive the bound $2^{m+n-1} - 2^{m-1} - 3 \cdot 2^{n-2} + 2$ for
   $K_m^* L_n^*$ and show that it is met by $U_m(a,b,c,d)$ and $U_n(d,c,b,a)$.

6. We prove that the known bound
   $2^{m+n-1} + 2^{m+n-4} - (2^{m-1} + 2^{n-1} - m - 1)$ for $(K_m L_n)^*$ is
   met by $W_m(a,b,c,d)$ and $W_n(d,c,b,a)$.

7. In obtaining these results, we prove Conjectures 7, 9, 10, 12, 15 and 17 of
   [2].

Sections 2 and 3 study boolean operations with one and two starred arguments,
respectively. Products with one or two starred arguments are examined in
Section 4. In Section 5 we consider stars of product, intersection, and
difference, and the final section concludes the paper.

2 Boolean Operations with One Starred Argument 

Recall that the complexity of $L^*$ is $2^{n-1} + 2^{n-2}$. Gao and Yu [8]
showed that the complexity of $K_m \cup L_n^*$ is
$m(2^{n-1} + 2^{n-2} - 1) + 1$. They used the following DFA's over alphabet
$\Sigma = \{a,b,c\}$: For $K$, let
$\mathcal{D}_K = (Q_K, \Sigma, \delta_K, 0, \{m-1\})$, with
$Q_K = \{0, \dots, m-1\}$, $a, b: 1_{Q_K}$, and $c: (0, \dots, m-1)$. For $L$,
let $\mathcal{D}_L = (Q_L, \Sigma, \delta_L, 0, \{n-1\})$, with
$Q_L = \{0, \dots, n-1\}$, $a: (0, \dots, n-1)$, $b$ defined by
$\delta_L(0,b) = 0$, $\delta_L(i,b) = i+1 \pmod{n}$, for $i = 1, \dots, n-1$,
and $c: 1_{Q_L}$. They showed that the same bound also holds for
$K_m \cap L_n^*$, and claimed that the same witnesses work. That claim is
incorrect, however, as is shown below.

The results of [8] for union are extended here to $K_m \cup L_n^*$,
$K_m \oplus L_n^*$ and $L_n^* \setminus K_m$ with witnesses $U_m(a,b,c)$ and
$U_n(b,a,c)$, and to $K_m \cap L_n^*$ and $K_m \setminus L_n^*$ with witnesses
$U_{\{0\},m}(a,b,c)$ and $U_n(b,a,c)$.

Proposition 1. Let $K_m$ and $L_n$ be two regular languages with complexities
$m$ and $n$. Then the complexities of $K_m \circ L_n^*$ and
$L_n^* \setminus K_m$ are at most $m(2^{n-1} + 2^{n-2} - 1) + 1$, for
$n \ge 3$.

Proof. Let $\mathcal{D}_1 = (Q_1, \Sigma, \delta_1, 0, F_1)$ with
$Q_1 = \{0, \dots, m-1\}$ be the DFA of $K_m$, and let
$\mathcal{D}_2 = (Q_2, \Sigma, \delta_2, 0, F_2)$ with
$Q_2 = \{0, \dots, n-1\}$ be the DFA of $L_n$. Construct $\mathcal{N}_2$, an
NFA accepting $L_n^*$, by adding a new final state $s$ to $\mathcal{D}_2$, with
the same outgoing transitions as state 0, and $\varepsilon$-transitions from
each final state in $F_2$ to 0. Now $\mathcal{N}_2$ has initial state $\{s\}$
instead of $\{0\}$. See Fig. 2 for an illustration. Let $\mathcal{E}_2$ be the
minimal DFA obtained from $\mathcal{N}_2$ by the subset construction and
minimization, and let $\mathcal{D}$ be the direct product of $\mathcal{D}_1$
and $\mathcal{E}_2$.

Fig. 2. DFA $\mathcal{D}_1$ of $U_4(a,b,c)$ and NFA $\mathcal{N}_2$ of
$(U_5(b,a,c))^*$.

For all five boolean operations, the states of $\mathcal{D}$ are ordered pairs,
where the first element is a state $i \in Q_1$ and the second is either
$\{s\}$ or a subset of $Q_2$. Because of the $\varepsilon$-transitions, the
allowable states are $(0, \{s\})$, all states of the form $(i, S)$ where $S$ is
non-empty and $S \cap F_2 = \emptyset$, and all states of the form $(i, S)$
where $S$ contains at least one final state together with 0. The total number
of possible states is largest if there is only one final state, say $n-1$.
Hence the number of states in $\mathcal{D}$ cannot exceed 1 plus
$m(2^{n-1} - 1)$ for states of the form $(i, S)$ where $S$ is non-empty and
$n-1 \notin S$, and $m2^{n-2}$ for states of the form $(i, S)$ where
$0, n-1 \in S$. Therefore the complexity of $K_m \circ L_n^*$ and
$L_n^* \setminus K_m$ cannot exceed $1 + m(2^{n-1} + 2^{n-2} - 1)$. □
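The count in this proof can be checked by direct enumeration. The sketch below is not part of the paper: it assumes the single final state $n-1$ used in the argument, represents the set $\{s\}$ by the string `"s"`, and uses the hypothetical helper names `allowable_states` and `bound`.

```python
from itertools import combinations


def allowable_states(m, n):
    """Enumerate the pairs counted in the proof, assuming F_2 = {n-1}."""
    states = [(0, "s")]                       # the initial state (0, {s})
    for i in range(m):
        for size in range(1, n + 1):          # non-empty subsets of Q_2
            for subset in combinations(range(n), size):
                S = frozenset(subset)
                # the epsilon-transition forces 0 into S whenever n-1 is in S
                if n - 1 not in S or 0 in S:
                    states.append((i, S))
    return states


def bound(m, n):
    return m * (2 ** (n - 1) + 2 ** (n - 2) - 1) + 1


for m, n in [(3, 3), (4, 5), (5, 4)]:
    print(m, n, len(allowable_states(m, n)), bound(m, n))
```

For each pair $(m, n)$ the two printed numbers agree, which mirrors the decomposition $1 + m(2^{n-1} - 1) + m2^{n-2}$.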

Theorem 1 ($K \circ L^*$). Let $K_m = U_m(a,b,c)$ and $L_n = U_n(b,a,c)$. For
$m, n \ge 3$, the complexities of $K_m \cup L_n^*$, $K_m \oplus L_n^*$, and
$L_n^* \setminus K_m$ are all $m(2^{n-1} + 2^{n-2} - 1) + 1$. Let $K'_m$ be the
language $U_{\{0\},m}(a,b,c)$. Then the complexities of $K'_m \cap L_n^*$ and
$K'_m \setminus L_n^*$ are also $m(2^{n-1} + 2^{n-2} - 1) + 1$.

Proof. Let the various automata be defined as in the proof of Proposition 1,
but this time with $K_m = U_m(a,b,c)$ and $L_n = U_n(b,a,c)$. We show that all
$m(2^{n-1} + 2^{n-2} - 1) + 1$ allowable states of $\mathcal{D}$ are reachable.
We use the notation $(i,S) \xrightarrow{w} (j,T)$ to denote that state $(j,T)$
is reached from $(i,S)$ by word $w$.

We have $(0,\{s\}) \xrightarrow{c} (0,\{0\}) \xrightarrow{(ba)^{i-1}} (i,\{0\})$
for $2 \le i \le m-1$. If $m$ is odd,

(0, {0}) ^> (1, {0}); if m is even, (0, {0}) (1, {0}). 

Brzozowski showed in [2] that all allowable states of $\mathcal{N}_2$ are
reachable from $\{0\}$ by words in $\{a,b\}^*$. These words act as permutations
on $\mathcal{D}_1$. To reach state $(i,S)$ apply the word $w$ that takes
$\{0\}$ to $S$ in $\mathcal{N}_2$ to state $(j,\{0\})$, where $j$ is such that
$j \xrightarrow{w} i$. Therefore all the allowable states are reachable.

For distinguishability, first consider two states $(i,S)$ and $(j,T)$, where
$S \ne T$. Then there is a $k$ either in $S \setminus T$ or in
$T \setminus S$; without loss of generality, assume $k \in S \setminus T$. By
applying $b^{n-1-k}$, we reach states $(i',S')$ and $(j',T')$, where
$n-1 \in S' \setminus T'$. Note that applying some cyclic shift $a^l$ to
$\mathcal{D}_1$, we reach states $(i'',S'')$ and $(j'',T'')$, where
$n-1 \in S'' \setminus T''$. These states are distinguishable for the boolean
operations as follows:

- $K_m \cup L_n^*$, $K_m \oplus L_n^*$, $L_n^* \setminus K_m$: apply a cyclic
  shift so that $i''$ and $j''$ are non-final in $\mathcal{D}_1$. This is
  possible since $\mathcal{D}_1$ has a single final state and $m \ge 3$.

- $K'_m \cap L_n^*$: map $i$ to the final state of $\mathcal{D}_1$.

- $K'_m \setminus L_n^*$: map $j$ to the final state of $\mathcal{D}_1$.

Now consider two states $(i,S)$ and $(j,S)$, $i < j$. We may assume $j < m-1$
because, since $m \ge 3$, we can apply a cyclic shift of $a$'s so that neither
$i$ nor $j$ is equal to $m-1$. Doing so might change $S$ to $S'$, but $S'$ is
the same in both states and $S'$ remains non-empty. The states are
distinguishable as follows:

- $K_m \cup L_n^*$, $K_m \oplus L_n^*$, $K'_m \setminus L_n^*$: apply $c$ so
  that $n-1 \notin S$, then $a^k$ for some $k$ to map $j$ to a final state.

- $K'_m \cap L_n^*$, $L_n^* \setminus K_m$: since $S$ is non-empty, apply a
  cyclic shift so $n-1 \in S$, then another shift so $j$ is final, and hence
  $i$ is non-final.

Finally, note that only states $(0,\{s\})$ and $(0,\{0\})$ reach $(1,\{1\})$ on
applying $a$; therefore by the previous argument, $(0,\{s\})$ is
distinguishable from all other states except possibly $(0,\{0\})$. Note now
that states $(0,\{s\})$ and $(0,\{0\})$ are distinguishable in
$K_m \cup L_n^*$, $K_m \oplus L_n^*$ and $L_n^* \setminus K_m$, but equivalent
in $K_m \cap L_n^*$ and $K_m \setminus L_n^*$. Hence we cannot have the same
witnesses for both intersection and union. However, the choice of final states
distinguishes $(0,\{s\})$ from $(0,\{0\})$ for $K'_m \cap L_n^*$ and
$K'_m \setminus L_n^*$. Therefore all reachable states are distinguishable. □
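For small parameters the statement of Theorem 1 can also be explored computationally. The following Python sketch is an illustration under stated assumptions, not the authors' procedure: it builds the product of $\mathcal{D}_1$ with the subset automaton of $\mathcal{N}_2$ directly, counts the reachable states, and merges equivalent states with a simple Moore-style partition refinement. The helper names `universal_witness` and `complexity`, and the parameter `k_final` (the final state of the first witness), are hypothetical.

```python
from collections import deque


def universal_witness(n, cycle, swap, sing):
    """Transition table of U_n, with the three inputs named as given."""
    return {
        cycle: [(q + 1) % n for q in range(n)],   # cycle (0, ..., n-1)
        swap: [1, 0] + list(range(2, n)),         # transposition (0, 1)
        sing: list(range(n - 1)) + [0],           # n-1 -> 0
    }


def complexity(m, n, op, k_final):
    """States of the minimal DFA of K o L*, where K is U_m(a,b,c) with final
    state k_final, L is U_n(b,a,c), and op combines the two memberships."""
    d1 = universal_witness(m, "a", "b", "c")
    d2 = universal_witness(n, "b", "a", "c")

    def step(state, x):
        i, S = state
        T = {d2[x][0 if q == "s" else q] for q in S}   # s behaves like state 0
        if n - 1 in T:
            T.add(0)                                   # epsilon-transition n-1 -> 0
        return d1[x][i], frozenset(T)

    def final(state):
        i, S = state
        return op(i == k_final, "s" in S or n - 1 in S)

    start = (0, frozenset({"s"}))
    reach, queue = {start}, deque([start])
    while queue:                                       # breadth-first reachability
        q = queue.popleft()
        for x in "abc":
            r = step(q, x)
            if r not in reach:
                reach.add(r)
                queue.append(r)

    cls = {q: int(final(q)) for q in reach}            # split by finality first
    while True:                                        # refine until stable
        ids, new = {}, {}
        for q in reach:
            sig = (cls[q],) + tuple(cls[step(q, x)] for x in "abc")
            new[q] = ids.setdefault(sig, len(ids))
        if len(ids) == len(set(cls.values())):
            return len(ids)
        cls = new


union = lambda in_k, in_l: in_k or in_l
inter = lambda in_k, in_l: in_k and in_l
print(complexity(3, 3, union, k_final=2))   # witnesses of Theorem 1 for union
print(complexity(3, 3, inter, k_final=0))   # K' with final state 0
print(complexity(3, 3, inter, k_final=2))   # union witnesses reused for intersection
```

The last call reuses the union witnesses for intersection; since $(0,\{s\})$ and $(0,\{0\})$ are then equivalent, its result stays below the bound.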

3 Boolean Operations with Two Starred Arguments 

Gao, Kari and Yu [6] showed that the bounds for $K_m^* \cup L_n^*$ and
$K_m^* \cap L_n^*$ are both
$(2^{m-1} + 2^{m-2} - 1)(2^{n-1} + 2^{n-2} - 1) + 1$. They used the following
DFA's over alphabet $\Sigma = \{a,b,c,d\}$: For $K$, let
$\mathcal{D}_K = (Q_K, \Sigma, \delta_K, 0, \{m-1\})$, with
$Q_K = \{0, \dots, m-1\}$, $a: (0, \dots, m-1)$, $b$ defined by
$\delta_K(0,b) = 0$, $\delta_K(i,b) = i+1 \pmod{m}$, for $i = 1, \dots, m-1$,
and $c, d: 1_{Q_K}$. For $L$, let
$\mathcal{D}_L = (Q_L, \Sigma, \delta_L, 0, \{n-1\})$, with
$Q_L = \{0, \dots, n-1\}$, $a, b: 1_{Q_L}$, $c: (0, \dots, n-1)$, and $d$
defined by $\delta_L(0,d) = 0$, $\delta_L(i,d) = i+1 \pmod{n}$, for
$i = 1, \dots, n-1$.

We extend these results to $K_m^* \oplus L_n^*$ and $K_m^* \setminus L_n^*$,
for which we now derive upper bounds.

Proposition 2. Let $K_m$ and $L_n$ be two regular languages with complexities
$m$ and $n$. Then the complexities of $K_m^* \circ L_n^*$ are at most
$(2^{m-1} + 2^{m-2} - 1)(2^{n-1} + 2^{n-2} - 1) + 1$ for $m, n \ge 3$.

Proof. Let $\mathcal{D}_1 = (Q_1, \Sigma, \delta_1, 0, F_1)$ be the DFA of
$K_m$, and $\mathcal{D}_2 = (Q_2, \Sigma, \delta_2, 0, F_2)$, the DFA of $L_n$.
Let $\mathcal{N}_1$ ($\mathcal{N}_2$) be the NFA for $K_m^*$ ($L_n^*$) obtained
by adding a new initial and final state $s_1$ ($s_2$), transitions from state
$s_1$ ($s_2$) the same as from 0 in $\mathcal{D}_1$ ($\mathcal{D}_2$), and an
$\varepsilon$-transition from each final state of $\mathcal{D}_1$
($\mathcal{D}_2$) to the initial state of $\mathcal{D}_1$ ($\mathcal{D}_2$).
See Fig. 3 for an example of this construction. Let $\mathcal{E}_1$ and
$\mathcal{E}_2$ be the minimal DFA's obtained from $\mathcal{N}_1$ and
$\mathcal{N}_2$ by the subset construction and minimization. Finally, let
$\mathcal{D}$ be the direct product of $\mathcal{E}_1$ and $\mathcal{E}_2$.

The states of $\mathcal{D}$ are ordered pairs, where the first element is a
subset of $\{s_1\} \cup Q_1$ and the second is a subset of $\{s_2\} \cup Q_2$.
Note that $s_1$ and $s_2$ can only appear in the initial state
$(\{s_1\}, \{s_2\})$ of $\mathcal{D}$. After any input is applied to
$\mathcal{D}$, the state has the form $(S,T)$, where $S$ is a state of
$\mathcal{E}_1$ other than $\{s_1\}$ (there are at most
$2^{m-1} + 2^{m-2} - 1$ such states), and $T$ is a state of $\mathcal{E}_2$
other than $\{s_2\}$ (there are at most $2^{n-1} + 2^{n-2} - 1$ such states),
and this is independent of the witnesses used. Thus
$(2^{m-1} + 2^{m-2} - 1)(2^{n-1} + 2^{n-2} - 1) + 1$ is an upper bound for the
number of states of the DFA for $K_m^* \circ L_n^*$. □
Fig. 3. NFA's $\mathcal{N}_1$ and $\mathcal{N}_2$ of $(W_4(a,b,c,d))^*$ and
$(W_5(d,c,b,a))^*$.

Theorem 2 ($K^* \circ L^*$). Let $K_m = W_m(a,b,c,d)$ and
$L_n = W_n(d,c,b,a)$. For $m, n \ge 3$, the complexities of
$K_m^* \cup L_n^*$ and $K_m^* \cap L_n^*$ are
$(2^{m-1} + 2^{m-2} - 1)(2^{n-1} + 2^{n-2} - 1) + 1$. If $K'_m$ is the language
of $\mathcal{W}_{\{0\},m}$, then the complexities of
$(K'_m)^* \setminus L_n^*$ and $(K'_m)^* \oplus L_n^*$ are also
$(2^{m-1} + 2^{m-2} - 1)(2^{n-1} + 2^{n-2} - 1) + 1$.

Proof. Let the various automata be defined as in the proof of Proposition 2,
but this time with $K_m = W_m(a,b,c,d)$ and $L_n = W_n(d,c,b,a)$. We now show
that all $(2^{m-1} + 2^{m-2} - 1)(2^{n-1} + 2^{n-2} - 1) + 1$ allowable states
discussed in Proposition 2 are reachable.

We first show that all allowable subsets of $Q_1$ are reachable in
$\mathcal{D}_1$, ignoring $\mathcal{D}_2$. First,
$\{s_1\} \xrightarrow{c} \{0\} \xrightarrow{a^{m-1}} \{0, m-1\}$. Suppose all
states $S$ with $\{0, m-1\} \subseteq S \subseteq Q_1$, $|S| = k$, $k \ge 2$
are reachable. All states $S$ with $\{0,1\} \subseteq S \subseteq Q_1$ of size
$k$ are now reachable by applying $a$. If $S = \{i_1, \dots, i_k\}$ with
$i_1 < \dots < i_k < m-1$, let $j = i_2 - i_1 - 1$; then
$\{0, 1, i_3 - j - i_1, \dots, i_k - j - i_1\} \to S$.

Now states $\{0, m-1\} \subseteq S$ of size $k+1$ can be reached as follows:
$\{i_1 - 1, \dots, i_{k-1} - 1, m-2\} \xrightarrow{a} \{0, i_1, \dots, i_{k-1}, m-1\}$.

Therefore all allowable states of $\mathcal{D}_1$ are reachable by words in
$\{a,c\}^*$.

In $\mathcal{N}_2$, $a$ and $c$ map states $s_2$ and 0 to 0. Therefore all
allowable states of $\mathcal{D}$ of the form $(S, \{0\})$ are reachable. A
symmetric argument shows that all states $T$ of $\mathcal{D}_2$ are reachable
by words in $\{b^2, d\}^*$ (as $b^2$ and $b$ are the same transformation on
$\mathcal{D}_2$). All of these words map states $S \subseteq Q_1$ to
themselves, except in the case $0, m-1 \notin S$, $m-2 \in S$. Let
$S = \{i_1, \dots, i_k\}$ be such a state; then for all allowable $T$,
$(\{i_1 - 1, \dots, i_k - 1\}, T)$ is reachable, and reaches $(S,T)$ when $a$
is applied. Therefore all allowable states are reachable.

Next we show that all the states of $\mathcal{D}$ are distinguishable. Recall
that for $K_m^* \cup L_n^*$ and $K_m^* \cap L_n^*$, we use $\{m-1\}$ as the
final state of $\mathcal{N}_1$, and for $(K'_m)^* \oplus L_n^*$ and
$(K'_m)^* \setminus L_n^*$, we use $\{0\}$.

Suppose we have states $(S_1,T_1)$, $(S_2,T_2)$ with $T_1 \ne T_2$. Then there
is a $k$ either in $T_1 \setminus T_2$ or in $T_2 \setminus T_1$; without loss
of generality, assume $k \in T_1 \setminus T_2$. By applying $d^{n-1-k}$, we
reach states $(S_1,T_1')$ and $(S_2,T_2')$, where
$n-1 \in T_1' \setminus T_2'$. Apply $c^2ac^2$ so that $T_1'$ and $T_2'$ are
unchanged, but now $1, 2 \notin S_1' \cup S_2'$. Then apply $a^{m-2}$ so
$0, m-1 \notin S_1'' \cup S_2''$. This distinguishes the two states for
$K_m^* \cup L_n^*$ and $(K'_m)^* \oplus L_n^*$. For $K_m^* \cap L_n^*$, since
$S_1 \ne \emptyset$, we may apply a cyclic shift to $\mathcal{D}_1$ so that
$m-1 \in S_1'$ to distinguish the states. For $(K'_m)^* \setminus L_n^*$, we
can assume that $h \in S_2''$, and use $a^{m-1-h}$ to map $S_2''$ to $S_2'''$,
where $\{0, m-1\} \subseteq S_2'''$. This also maps $S_1''$ to $S_1'''$, and
keeps $T_1'$ and $T_2'$ unchanged. Since $n-1 \in T_1' \setminus T_2'$, we have
$(S_1''', T_1')$ is non-final and $(S_2''', T_2')$ is final for
$(K'_m)^* \setminus L_n^*$.

Now suppose $S_1 \ne S_2$. For $K_m^* \cup L_n^*$ and $K_m^* \cap L_n^*$ the
above argument is symmetric. For the other two operations, apply a cyclic shift
so that $m-1 \in S_1' \setminus S_2'$. Now apply $(cba)^{m-3}$ so that
$m-1 \in S_1'' \setminus S_2''$, and $2, \dots, m-2 \notin S_1'' \cup S_2''$.
Apply $a$ so that $0 \in S_1''' \setminus S_2'''$. Then as above, apply
$b^2d^{n-2}$ so that $n-1 \notin T_1' \cup T_2'$, while leaving $S_1'''$ and
$S_2'''$ unchanged. This distinguishes the states for
$(K'_m)^* \setminus L_n^*$ and $(K'_m)^* \oplus L_n^*$.

Therefore all $(2^{m-1} + 2^{m-2} - 1)(2^{n-1} + 2^{n-2} - 1)$ states of the
form $(S,T)$ are distinguishable. It remains to distinguish
$(\{s_1\}, \{s_2\})$ from the other states. As in Theorem 1,
$(\{s_1\}, \{s_2\})$ is distinguished from all states except
$(\{0\}, \{0\})$ by $a$. It is distinguishable from $(\{0\}, \{0\})$ by the
choice of final state of $\mathcal{D}_1$. □

4 Products with Starred Arguments 
4.1 The Language KL* 

The complexity of $KL^*$ was studied by Cui, Gao, Kari, and Yu [5]. They showed
that $m(2^{n-1} + 2^{n-2}) - 2^{n-2}$ is a tight bound using the following
witnesses over alphabet $\Sigma = \{a,b,c\}$: For $K$, let
$\mathcal{D}_K = (Q_K, \Sigma, \delta_K, q_0, \{q_{m-1}\})$, with
$Q_K = \{q_0, \dots, q_{m-1}\}$, $a: (q_0, \dots, q_{m-1})$,
$\delta_K(q_i, b) = q_{i+1}$ for $i = 0, \dots, m-3$,
$\delta_K(q_{m-2}, b) = q_0$, $\delta_K(q_{m-1}, b) = q_{m-2}$, and
$\delta_K(q_i, c) = q_{i+1}$ for $i = 0, \dots, m-3$,
$\delta_K(q_{m-2}, c) = q_0$, $\delta_K(q_{m-1}, c) = q_{m-1}$. For $L$, let
$\mathcal{D}_L = (Q_L, \Sigma, \delta_L, 0, \{n-1\})$, with
$Q_L = \{0, \dots, n-1\}$, $a: (0, \dots, n-1)$, $\delta_L(0,b) = 0$,
$\delta_L(i,b) = i+1$ for $i = 1, \dots, n-2$, $\delta_L(n-1, b) = 1$;
$c$ : (™7 ). We prove that two permutationally equivalent dialects of
$U_n(a,b,c)$ also meet the bound.

Theorem 3 ($KL^*$). Let $K_m = T_m(a,b,c)$, and $L_n = T_n(b,a,c)$. For
$m, n \ge 3$, the complexity of $K_m L_n^*$ is
$m(2^{n-1} + 2^{n-2}) - 2^{n-2}$.

Proof. Let $\mathcal{D}_1 = (Q_1, \Sigma, \delta_1, q_0, \{q_{m-1}\})$ with
$Q_1 = \{q_0, \dots, q_{m-1}\}$ be the DFA of $K_m$, and let
$\mathcal{D}_2 = (Q_2, \Sigma, \delta_2, 0, \{n-1\})$ with
$Q_2 = \{0, \dots, n-1\}$ be the DFA of $L_n$. Let $\mathcal{N}_2$ be the NFA
for $L_n^*$, and let $\mathcal{N}$ be the NFA for the product $K_m L_n^*$.
Figure 4 shows our witnesses $T_4(a,b,c)$ and $T_5(b,a,c)$ and the NFA
$\mathcal{N}$ for $KL^*$. We perform the subset construction and minimization
of $\mathcal{N}$ to obtain the DFA $\mathcal{D}$ for the product $KL^*$.

Fig. 4. Witness $\mathcal{N}$ for $T_4(a,b,c)(T_5(b,a,c))^*$.
The states of $\mathcal{D}$ are subsets of $Q_1 \cup Q_2 \cup \{s\}$. Note that
$q_{m-1}$ cannot appear in a state of $\mathcal{D}$ without $s$, and vice
versa. Also, $n-1$ cannot appear without 0, but 0 can appear without $n-1$.
Each state of $\mathcal{D}$ must contain exactly one of
$\{q_0\}, \dots, \{q_{m-2}\}$ or $\{q_{m-1}, s\}$, and either a (possibly
empty) subset of $Q_2$ not containing $n-1$, or a subset of $Q_2$ containing
both $n-1$ and 0. Hence there are at most $m(2^{n-1} + 2^{n-2})$ reachable
subsets; we now show that all these subsets can be reached.

Set $\{q_0\}$ is the initial state of $\mathcal{D}$, set $\{q_i\}$ for
$i \le m-2$ is reached by $a^i$, and $\{q_{m-1}, s\}$, by $a^{m-1}$.

Suppose all allowable states of the form $\{q_{m-1}, s\} \cup S$, $|S| \le k$,
$k \ge 0$, are reachable. Let $S \subseteq Q_2$, $|S| = k+1$. If $1 \in S$ and
$0 \notin S$, then we have
$\{q_{m-1}, s\} \cup (S \setminus \{1\}) \xrightarrow{a} \{q_0\} \cup S$. If
$0, 1 \in S$, then
$\{q_{m-1}, s\} \cup (S \setminus \{0\}) \xrightarrow{a} \{q_0\} \cup S$. If
$0 \in S$ and $1 \notin S$, then
$\{q_{m-1}, s\} \cup (S \setminus \{0\}) \to \{q_0\} \cup S$. Therefore all
states $\{q_0\} \cup S$, $|S| = k+1$, and either $0 \in S$ or $1 \in S$, are
reachable. Every state $\{q_0\} \cup S$, where $n-1 \notin S$, is reachable by
an even number of $b$'s from a state containing either 0 or 1. Every
$S = \{0, i_1, \dots, i_{k-1}, n-1\}$ is also reachable in this way (by mapping
either 0 or 1 to $n-1$). So all states $\{q_0\} \cup S$, $|S| = k+1$, are
reachable. By applying cyclic shifts $a^i$, all states $\{q_i\} \cup S$,
$i < m-1$ and $\{q_{m-1}, s\} \cup S$ are reachable.

Any state of the form $\{q_{m-1}, s\} \cup T$, where
$T \subseteq Q_2 \setminus \{0, n-1\}$, is equivalent to
$\{q_{m-1}, s, 0\} \cup T$, as they are both final and are mapped to the same
state under any input. So the number of distinguishable states of
$\mathcal{D}$ is at most $m(2^{n-1} + 2^{n-2}) - 2^{n-2}$. We prove that there
are precisely that many distinguishable states.

Consider two states of the form $\{q_i\} \cup S$, $\{q_{m-1}, s\} \cup T$,
where $i < m-1$. These states are distinguished by $cb^{n-2}$. Any pair
$\{q_i\} \cup S$, $\{q_j\} \cup T$, $i \ne j$ can be transformed into states of
this form by applying a cyclic shift. Now consider $\{q_i\} \cup S$,
$\{q_i\} \cup T$, $S \ne T$, $i < m-1$. There exists a cyclic shift $b^k$ which
transforms the states so that $n-1 \in S \oplus T$, and this distinguishes the
states.

Then the only remaining case is $\{q_{m-1}, s\} \cup S$,
$\{q_{m-1}, s\} \cup T$, and $S \ne T$. As we stated earlier, if
$S \oplus T = \{0\}$ then the states are indistinguishable. Otherwise, let
$k \in S \oplus T$, $k > 0$. Apply $b^{n-1-k}$ so that $n-1 \in S \oplus T$.
Then applying $a$ to map $\{q_{m-1}, s\}$ to $\{q_0, 1\}$ distinguishes the
states. □

4.2 The Language K*L

Cui, Gao, Kari and Yu [4] proved using quaternary witnesses that the complexity
of $K^*L$ is $5 \cdot 2^{m+n-3} - 2^{m-1} - 2^n + 1$. Let
$\Sigma = \{a,b,c,d\}$. For $K$ they used
$\mathcal{D}_K = (Q_K, \Sigma, \delta_K, q_0, \{q_{m-1}\})$, with
$Q_K = \{q_0, \dots, q_{m-1}\}$, $a: (q_0, \dots, q_{m-1})$,
$\delta_K(q_0, b) = q_0$, $\delta_K(q_i, b) = q_{i+1 \bmod m}$ for
$i = 1, \dots, m-1$, and $c, d: 1_{Q_K}$. For $L$, their witness was
$\mathcal{D}_L = (Q_L, \Sigma, \delta_L, 0, \{n-1\})$, with
$Q_L = \{0, \dots, n-1\}$, $a, b: 1_{Q_L}$, $c: (0, \dots, n-1)$, $d$: [...].
We show here that two quaternary permutationally equivalent languages also
work.

Theorem 4 ($K^*L$). Let $K_m = U_m(a,b,c,d)$ and $L_n = U_n(d,c,b,a)$. For
$m, n \ge 3$, the complexity of $K_m^* L_n$ is
$5 \cdot 2^{m+n-3} - 2^{m-1} - 2^n + 1$.
Fig. 5. NFA $\mathcal{N}$ for $(U_4(a,b,c,d))^* U_5(d,c,b,a)$.



Proof. Let $\mathcal{D}_1 = (Q_1, \Sigma, \delta_1, q_0, \{q_{m-1}\})$ with
$Q_1 = \{q_0, \dots, q_{m-1}\}$ be the DFA of $K_m$, and let
$\mathcal{D}_2 = (Q_2, \Sigma, \delta_2, 0, \{n-1\})$ with
$Q_2 = \{0, \dots, n-1\}$ be the DFA of $L_n$. Let $\mathcal{N}_1$ be the NFA
for $K_m^*$, and let $\mathcal{N}$ be the NFA for the product $K_m^* L_n$. We
perform the subset construction and minimization of $\mathcal{N}$ to obtain the
DFA $\mathcal{D}$ for the product $K^*L$. The construction is illustrated in
Fig. 5.

Owing to the $\varepsilon$-transitions, the allowable states of the DFA are
$\{s, 0\}$, all $(2^{m-1} - 1)(2^n - 1)$ subsets of the form $S \cup T$ where
$\emptyset \subsetneq S \subseteq Q_1$, $q_{m-1} \notin S$,
$\emptyset \subsetneq T \subseteq Q_2$, and all $2^{m-2} \cdot 2^{n-1}$ subsets
of the form $S \cup T$, where $q_0, q_{m-1} \in S \subseteq Q_1$ and
$0 \in T \subseteq Q_2$. There are $5 \cdot 2^{m+n-3} - 2^{m-1} - 2^n + 2$ such
subsets and we will now show that they are all reachable.

The initial state of $\mathcal{D}$ is $\{s, 0\}$. It is known from [2] that all
allowable subsets of $\mathcal{N}_1$ are reachable by words in $\{a,b\}^*$.
These inputs all map 0 to itself, and hence all allowable states of the form
$S \cup \{0\}$ are reachable.

If $q_{m-1} \notin S$ and $T = \{t_1, \dots, t_k\}$, then $S \cup \{0, t_2 - t_1, \dots, t_k - t_1\} \xrightarrow{d^{t_1}} S \cup T$.
Let T = {Q,ii,.. .,t k }, < t x < ■ ■ ■ < t k , and S = {q ll ,. . . , q k }, i x < ■ ■ ■ < ii < 
m - 1. Also, let S' = {g^-^-i, . . . , q m -2} and T = {ti, . . . ,t k }. Then 

S'UT'^> {0, q i2 - H , . . . , q H _ 4l } U T ^ S U T. 

Moreover, $S \cup \{t_0, t_1 + t_0, \dots, t_k + t_0\}$ can be reached from
$S \cup T$ by $d^{t_0}$. Combining these results shows that all allowable
states $S \cup T$ with $q_{m-1} \notin S$ are reachable. Finally, if
$S = \{q_0, q_{i_1}, \dots, q_{i_k}, q_{m-1}\}$, and $0 \in T$, then
$\{q_{i_1 - 1}, \dots, q_{i_k - 1}, q_{m-2}\} \cup T \xrightarrow{a} S \cup T$.
Therefore all allowable states are reachable.

For distinguishability, first consider states $S_1 \cup T_1$, $S_2 \cup T_2$.
If $T_1 \ne T_2$, then applying a cyclic shift $d^k$ transforms the states so
that $n-1 \in T_1 \oplus T_2$, distinguishing the states. If $S_1 \ne S_2$,
apply a cyclic shift $a^k$ so that $q_{m-1} \in S_1 \oplus S_2$. Then apply
$bd$ so that $0 \in T_1 \oplus T_2$, and the states are distinguishable by the
previous case.

Finally, the initial state $\{s\} \cup \{0\}$ is indistinguishable from
$\{q_0\} \cup \{0\}$, as any non-empty input transforms these two states into
the same state. So then there are $5 \cdot 2^{m+n-3} - 2^{m-1} - 2^n + 1$
distinguishable states. □
4.3 The Language K*L*



The combined operation K*L* appears not to have been studied before. 

Proposition 3. The complexity of the operation $K_m^* L_n^*$ is at most
$2^{m+n-1} - 2^{m-1} - 3 \cdot 2^{n-2} + 2$ for $m, n \ge 3$.

Proof. Let $\mathcal{D}_1 = (Q_1, \Sigma, \delta_1, q_0, F_1)$ with
$Q_1 = \{q_0, \dots, q_{m-1}\}$ be the DFA of $K_m$, and let
$\mathcal{D}_2 = (Q_2, \Sigma, \delta_2, 0, F_2)$ with
$Q_2 = \{0, \dots, n-1\}$ be the DFA of $L_n$. Construct NFA's $\mathcal{N}_1$
and $\mathcal{N}_2$ accepting $K_m^*$ and $L_n^*$ by adding new initial states
$s_1$ and $s_2$, which are also final. Let $\mathcal{N}$ be the NFA for
$K_m^* L_n^*$, and let $\mathcal{D}$ be the DFA obtained by the subset
construction and minimization of $\mathcal{N}$. These constructions are
illustrated in Fig. 6.

Fig. 6. NFA $\mathcal{N}$ of $(U_4(a,b,c,d))^*(U_5(d,c,b,a))^*$.



The initial state of $\mathcal{D}$ is $\{s_1, s_2\}$. Note that any state $R$
of $\mathcal{D}$ containing $s_2$ but not 0, is equivalent to $R \cup \{0\}$,
since both states are final because of $s_2$, and $s_2$ and 0 have identical
outgoing transitions. Hence we can ignore states like $R$ in our counting, and
assume that every state containing $s_2$ also contains 0. Due to the
$\varepsilon$-transitions, the allowable states of the DFA are
$\{s_1, s_2\}$, and all subsets of the form $S \cup T$, where
$\emptyset \subsetneq S \subseteq Q_1$,
$\emptyset \subsetneq T \subseteq \{s_2\} \cup Q_2$, and fall into one of the
following cases:

- $S \cap F_1 = \emptyset$, $T \cap F_2 = \emptyset$;

- $S \cap F_1 = \emptyset$, $T$ contains at least one state of $F_2$ and 0;

- $S$ contains at least one state of $F_1$ together with $q_0$, and
  $s_2, 0 \in T$.

One verifies that the possible number of states is greatest when there is only
one final state, say $q_{m-1}$, in $F_1$ and only one final state, say $n-1$,
in $F_2$. Hence we have the cases:

- $q_{m-1} \notin S$, $n-1 \notin T$: $(2^{m-1} - 1)(2^{n-1} - 1)$ states;

- $q_{m-1} \notin S$, $0, n-1 \in T$: $(2^{m-1} - 1)2^{n-2}$ states;

- $q_0, q_{m-1} \in S$, $s_2, 0 \in T$: $2^{m+n-3}$ states.

Therefore there are a total of $2^{m+n-1} - 2^{m-1} - 3 \cdot 2^{n-2} + 2$
allowable states. Hence the complexity of $K_m^* L_n^*$ is at most
$2^{m+n-1} - 2^{m-1} - 3 \cdot 2^{n-2} + 2$. □
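The case analysis above can be cross-checked by enumeration. The sketch below is illustrative and rests on the assumptions stated in the proof: single final states $q_{m-1}$ and $n-1$, the convention that $s_2 \in T$ implies $0 \in T$, and the reading that $s_2$ belongs to $T$ exactly when $q_{m-1}$ belongs to $S$. The function names are hypothetical.

```python
from itertools import product


def count_allowable(m, n):
    """Count the allowable states S ∪ T of the proof, plus the initial state."""
    total = 1                                          # initial state {s1, s2}
    for s_bits in product((False, True), repeat=m):    # s_bits[i]: q_i in S
        if not any(s_bits):
            continue                                   # S must be non-empty
        for has_s2 in (False, True):
            for t_bits in product((False, True), repeat=n):  # t_bits[j]: j in T
                if t_bits[n - 1] and not t_bits[0]:
                    continue                           # n-1 forces 0
                if has_s2 != s_bits[m - 1]:
                    continue                           # assumed: s2 iff q_{m-1}
                if s_bits[m - 1] and not (s_bits[0] and t_bits[0]):
                    continue                           # q_{m-1} forces q_0 and 0
                if not has_s2 and not any(t_bits):
                    continue                           # T must be non-empty
                total += 1
    return total


def star_product_bound(m, n):
    return 2 ** (m + n - 1) - 2 ** (m - 1) - 3 * 2 ** (n - 2) + 2


for m, n in [(3, 3), (4, 3), (3, 5)]:
    print(m, n, count_allowable(m, n), star_product_bound(m, n))
```

The enumeration reproduces the sum $(2^{m-1} - 1)(2^{n-1} - 1) + (2^{m-1} - 1)2^{n-2} + 2^{m+n-3} + 1$ term by term.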
Theorem 5 ($K^*L^*$). Let $K_m = U_m(a,b,c,d)$ and $L_n = U_n(d,c,b,a)$. For
$m, n \ge 3$, the complexity of $K_m^* L_n^*$ is
$2^{m+n-1} - 2^{m-1} - 3 \cdot 2^{n-2} + 2$.

Proof. Let the various automata be defined as in the proof of Proposition 3,
but this time with $K_m = U_m(a,b,c,d)$ and $L_n = U_n(d,c,b,a)$. The
reachability of all of the states of $\mathcal{D}$ follows the proof in
Theorem 4 for all states $S \cup T$ where $n-1 \notin T$. Let
$T = \{0, t_1, \dots, t_k, n-1\}$. If $q_{m-1} \notin S$, then
$S \cup \{0, t_2 - t_1, \dots, t_k - t_1, n-1-t_1\} \xrightarrow{d^{t_1}} S \cup T$.
If $q_{m-1} \in S$, say $S = \{q_0, q_{i_1}, \dots, q_{i_l}, q_{m-1}\}$, then
$\{q_{i_1 - 1}, \dots, q_{i_l - 1}, q_{m-2}\} \cup T \xrightarrow{a} S \cup T$.
Therefore all allowable states are reachable.

For distinguishability, first consider states $S_1 \cup T_1$, $S_2 \cup T_2$,
where $S_1, S_2 \subseteq Q_1$ and $T_1, T_2 \subseteq \{s_2\} \cup Q_2$. The
set of final states of the NFA is $\{s_2, n-1\}$; however, any set containing
$s_1$ or $q_{m-1}$ also contains $s_2$, and hence is a final state of
$\mathcal{D}$. Note that applying $c$ always results in a state $S \cup T$,
where $q_{m-1} \notin S$ and $s_2 \notin T$, and applying $b$ causes
$n-1 \notin T$. If $T_1 \ne T_2$, then applying a cyclic shift $d^k$ transforms
the states so that $n-1 \in T_1 \oplus T_2$, and then applying $c$
distinguishes the states. If $S_1 \ne S_2$, apply a cyclic shift $a^k$ so that
$q_{m-1} \in S_1 \oplus S_2$, then apply $b$ to distinguish the states.

Finally, consider the initial state $\{s_1, s_2\}$, and any state $R$ not
containing $s_1$, since the initial state is the only one containing $s_1$.
There are three cases:

1. $q_0 \notin R$. Applying $a$, from $\{s_1, s_2\}$ we reach $\{q_1, 0\}$, and
   from $R$ we reach $R'$, where $q_1 \notin R'$. By the argument in the second
   paragraph of the proof, $\{s_1, s_2\}$ is distinguished from $R$.

2. $q_0 \in R$, and $R \ne \{q_0, 0\}$: If $ad$ is applied, then
   $\{s_1, s_2\}$ goes to $\{q_1, 1\}$, and $R$ goes to $R'$ such that there
   exists $x \in R' \setminus \{q_1, 1\}$. Then these two states are
   distinguishable by the previous argument.

3. $R = \{q_0, 0\}$: State $\{s_1, s_2\}$ is final, but $\{q_0, 0\}$ is not.

Hence all the allowable states are distinguishable and the theorem holds. □

5 Stars of Binary Operations 
5.1 The Language (KL)* 

In 2008 Gao, K. Salomaa, and Yu [7] proved that
$2^{m+n-1} + 2^{m+n-4} - (2^{m-1} + 2^{n-1} - m - 1)$ is a tight upper bound
for $(KL)^*$. They used the following DFA's over alphabet
$\Sigma = \{a,b,c,d\}$: For $K$, let
$\mathcal{D}_K = (Q_K, \Sigma, \delta_K, q_0, \{q_{m-1}\})$ with
$a: (q_0, \dots, q_{m-1})$, $b: 1_{Q_K}$, $c$ defined by
$\delta_K(q_0, c) = \delta_K(q_{m-1}, c) = q_0$,
$\delta_K(q_i, c) = q_{i+1}$, for $i = 1, \dots, m-2$, and $d: 1_{Q_K}$. For
$L$, let $\mathcal{D}_L = (Q_L, \Sigma, \delta_L, 0, \{n-1\})$ with
$a: 1_{Q_L}$, $b: (0, \dots, n-1)$, $c: 1_{Q_L}$, and $d$ defined by
$\delta_L(0, d) = \delta_L(n-1, d) = 0$, $\delta_L(j, d) = j+1$ for
$j = 1, \dots, n-2$. We show that two permutationally equivalent dialects
$W_m(a,b,c,d)$ and $W_n(d,c,b,a)$ of $U_n(a,b,c,d)$ also meet the bound.

Theorem 6 ($(KL)^*$). Let $K_m = W_m(a, b, c, d)$ and $L_n = W_n(d, c, b, a)$. For
$m, n \ge 3$, the complexity of $(K_m L_n)^*$ is $2^{m+n-1} + 2^{m+n-4} - (2^{m-1} + 2^{n-1} - m - 1)$.

Fig. 7. NFA for $(W_4(a, b, c, d)\, W_5(d, c, b, a))^*$.

Proof. Let $\mathcal{D}_1 = (Q_1, \Sigma, \delta_1, q_0, \{q_{m-1}\})$ with $Q_1 = \{q_0, \ldots, q_{m-1}\}$ be the DFA
of $K_m$, and let $\mathcal{D}_2 = (Q_2, \Sigma, \delta_2, 0, \{n-1\})$ with $Q_2 = \{0, \ldots, n-1\}$ be the DFA
of $L_n$. Let $\mathcal{N}$ be the NFA for $(KL)^*$. This NFA is shown in Fig. 7 for $m = 4$
and $n = 5$. Let $\mathcal{D}$ be the DFA obtained from $\mathcal{N}$ by the subset construction and
minimization.

The states of $\mathcal{D}$ are the initial state $\{s\}$ and states of the form $S \cup T$ where
$\emptyset \neq S \subseteq Q_1$ and $T \subseteq Q_2$. Because of the $\varepsilon$-transitions, the allowable states $S \cup T$
must have either $q_{m-1} \notin S$, or $q_{m-1} \in S$ and $0 \in T$. Moreover, if $|S| > 1$, then
$T \neq \emptyset$, as at least one $\varepsilon$-transition from $n-1$ to $q_0$ must have been used. The
number of allowable states is counted as follows:

1. First, we have the initial state $\{s\}$.

2. If $T = \emptyset$, then $|S| = 1$, and $q_{m-1} \notin S$. There are $m-1$ such states.

3. If $T \neq \emptyset$, then $|S| \ge 1$.

(a) $n-1 \notin T$: If $q_{m-1} \notin S$, then there are $(2^{m-1} - 1)(2^{n-1} - 1)$ such states.
Otherwise, $q_{m-1} \in S$ and $0 \in T$, and there are $2^{m+n-3}$ such states.

(b) $n-1 \in T$: Then $q_0 \in S$. If $q_{m-1} \notin S$, there are $2^{m+n-3}$ such states.
Otherwise, $q_{m-1} \in S$ and $0 \in T$, and there are $2^{m+n-4}$ such states.

Altogether we have $2^{m+n-1} + 2^{m+n-4} - (2^{m-1} + 2^{n-1} - m - 1)$ states. We will
now show they are all reachable.
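The case analysis above can be cross-checked by brute force. The following is a minimal sketch, not part of the original proof: it enumerates all pairs $(S, T)$ that satisfy the stated constraints and compares the count with the closed formula. The function names `count_allowable` and `closed_form` are hypothetical, and states $q_i$ are represented simply by the integer $i$.

```python
from itertools import combinations


def subsets(items):
    """Yield every subset of `items` as a frozenset."""
    items = list(items)
    for size in range(len(items) + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)


def count_allowable(m, n):
    """Count the initial state {s} plus all allowable sets S ∪ T."""
    count = 1  # the initial state {s}
    for S in subsets(range(m)):          # S is a subset of Q_1, q_i encoded as i
        if not S:
            continue                     # S must be non-empty
        for T in subsets(range(n)):      # T is a subset of Q_2
            if (m - 1) in S and 0 not in T:
                continue                 # q_{m-1} in S forces 0 in T
            if (n - 1) in T and 0 not in S:
                continue                 # n-1 in T forces q_0 in S
            if not T and len(S) != 1:
                continue                 # |S| > 1 forces T to be non-empty
            count += 1
    return count


def closed_form(m, n):
    """The bound stated in Theorem 6."""
    return 2 ** (m + n - 1) + 2 ** (m + n - 4) - (2 ** (m - 1) + 2 ** (n - 1) - m - 1)


for m in range(3, 6):
    for n in range(3, 6):
        assert count_allowable(m, n) == closed_form(m, n)
```

The check only confirms the counting step; reachability and distinguishability of these states still require the arguments that follow.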

The initial state is $\{s\}$. We have $\{s\} \to \{q_0\} \to \{q_i\}$ for $i < m-1$.

For $i < m-1$ and $T = \{t_1, \ldots, t_k\} \subseteq Q_2 \setminus \{n-1\}$ with $t_1 < \cdots < t_k$, the
state $\{q_i\} \cup T$ is reachable by $\{q_i\} \cup \{t_2 - t_1, \ldots, t_k - t_1\} \to \{q_i\} \cup T$. Suppose
$n-1 \in T$, say $T = \{t_1, \ldots, t_k, n-1\}$. If $T \neq Q_2$, then the state $\{q_0\} \cup T$ is
reachable by applying a cyclic shift to some $\{q_0\} \cup T''$, where $n-1 \notin T''$.
Moreover, $\{q_{m-2}\} \cup (Q_2 \setminus \{n-1\}) \to \{q_0, q_1, q_{m-1}\} \cup Q_2 \to \{q_0\} \cup Q_2$. Finally,
if $0 \in T$ then $\{q_{m-2}\} \cup T \to \{q_{m-1}\} \cup T$. So all allowable states of the form
$S \cup T$, $|S| = 1$ are reachable.

Let $S = \{q_{i_1}, \ldots, q_{i_k}\}$, $0 < i_1 < \cdots < i_k$. Since $n-1 \notin T$, we have
$\{q_{i_1 - 1}, \ldots, q_{i_k - 1}\} \cup T \to S \cup T$. Now suppose $S = \{q_0, q_{i_2}, \ldots, q_{i_k}\}$. If $n-1 \in T$,
then $\{q_0, q_{i_3 - i_2}, \ldots, q_{i_k - i_2}\} \cup T \to S \cup T$. If $n-1 \notin T$ and $q_{m-1} \in S$,
then $T = \{0, t_2, \ldots, t_k\}$ and $t_k < n-1$. Let $T' = \{0, t_2 - 1, \ldots, t_k - 1, n-1\}$.
Then $S \cup T'$ is reachable, and $S \cup T' \to S \cup T \cup \{1\}$; if $1 \notin T$, apply $b^2$ to get
$S \cup T$.

Finally, suppose $q_0 \in S$, $q_{m-1} \notin S$, and $n-1 \notin T$. Suppose $T = \{t_1, \ldots, t_l\}$,
$t_1 < \cdots < t_l$, and let $T'' = \{t_2 - t_1 - 1, \ldots, t_l - t_1 - 1, n-1\}$. Since $q_0 \in S$
and $n-1 \in T''$, state $S \cup T''$ is reachable. Then we reach $S \cup T$ from $S \cup T''$ by
applying $d^{t_1+1}$.
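The last step relies on $d$ acting as the cyclic shift $(0, \ldots, n-1)$ on $Q_2$. A small sketch of that arithmetic follows, tracking only the $Q_2$ component and ignoring the $\varepsilon$-transitions (which add nothing new here, since $q_0 \in S$ already and $q_{m-1} \notin S$). The helper names `cyclic_shift` and `shifted_predecessor` are hypothetical.

```python
def cyclic_shift(subset, n, times=1):
    """Apply the cycle (0, ..., n-1) `times` times to each element of `subset`."""
    return frozenset((t + times) % n for t in subset)


def shifted_predecessor(T, n):
    """Build T'' = {t_2 - t_1 - 1, ..., t_l - t_1 - 1, n - 1} from T = {t_1 < ... < t_l}.

    Assumes T is non-empty and does not contain n - 1. Returns (T'', t_1).
    """
    ts = sorted(T)
    t1 = ts[0]
    return frozenset(t - t1 - 1 for t in ts[1:]) | {n - 1}, t1


n = 7
T = frozenset({2, 4, 5})
T_pp, t1 = shifted_predecessor(T, n)
# Shifting T'' by t_1 + 1 positions recovers T: n - 1 wraps around to t_1,
# and every other element t_j - t_1 - 1 moves up to t_j.
assert cyclic_shift(T_pp, n, t1 + 1) == T
```

Note that $T''$ contains $n-1$, which is why the earlier case ($q_0 \in S$ and $n-1 \in T$) applies to $S \cup T''$.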

Therefore all the allowable states are reachable. 

We now show all states are distinguishable. Let $S_1 \cup T_1$, $S_2 \cup T_2$ be two
distinct states. If $T_1 \neq T_2$, then the states are distinguishable by a cyclic shift
$d^k$. If $S_1 \neq S_2$, without loss of generality we may assume $q_{m-1} \in S_1 \oplus S_2$. Then
applying Pd^ 1 results in states Si UT{, S 2 UT^, where e Ti©T 2 , so the states 
are distinguishable. Finally, the initial state $\{s\}$ is distinguished from every state
other than $\{q_0\}$ by $a$; it is distinguishable from $\{q_0\}$ because it is final. □

5.2 The Language (K ∪ L)*

In 2007 A. Salomaa, K. Salomaa, and S. Yu [10] showed that the complexity
of $(K \cup L)^*$ is $2^{m+n-1} - (2^{m-1} + 2^{n-1} - 1)$ with ternary witnesses. Jirásková
and Okhotin [5] used binary witnesses: For $K$, let $\mathcal{D}_K = (Q_K, \Sigma, \delta_K, 0, \{0\})$
with $a : (0, \ldots, m-1)$, and $b$ defined by $\delta_K(i, b) = i + 1$, for $i = 0, \ldots, m-2$,
$\delta_K(m-1, b) = 1$. For $L$, let $\mathcal{D}_L = (Q_L, \Sigma, \delta_L, 0, \{0\})$ with a : (°) and $b :
(0, \ldots, n-1)$. Permutationally equivalent binary dialects of $U_n(a, b, c)$ can also
be used. Let $S_n = S_n(a, b) = (Q, \Sigma, \delta_S, 0, \{0\})$, where $a : (0, \ldots, n-1)$, and
b : (, ). The following theorem was proved in [2]:

Theorem 7 ($(K_m \cup L_n)^*$). For $m, n \ge 3$, the complexity of $(S_m(a, b) \cup S_n(b, a))^*$
is $2^{m+n-1} - (2^{m-1} + 2^{n-1} - 1)$.

5.3 The Language (K ∩ L)*

It was also proved in [5] that the complexity of $(K \cap L)^*$ is $2^{mn-1} + 2^{mn-2}$, which
is the composition of the complexities of intersection and star. Their witnesses
$K$ and $L$ were over an alphabet of six letters, $\Sigma = \{a, b, c, d, e, f\}$: For $K$,
let $\mathcal{D}_K = (Q_K, \Sigma, \delta_K, 0, \{m-1\})$, with $Q_K = \{0, \ldots, m-1\}$. For $L$, let $\mathcal{D}_L =
(Q_L, \Sigma, \delta_L, 0, \{n-1\})$, with $Q_L = \{0, \ldots, n-1\}$. The transitions were as follows:

a : (0, ..., m-1)
a :
: (0, ..., n-1)
b : (0, ..., n-1)
c : (1, ..., m-1)
c : 1_{Q_L}
d : (1, ..., n-1)
e : (o)
f : 1_{Q_K}
f : ©

We conjecture that quinary witnesses can also be used. Let $\Sigma = \{a, b, c, d, e\}$
and $U_n(a, b, c, d, e) = (Q_K, \Sigma, \delta_U, 0, \{n-1\})$, where $Q_K = \{0, \ldots, n-1\}$, $a :
(0, \ldots, n-1)$, $b : (0, 1)$, $c : \binom{n-1}{0}$, $d = 1_{Q_K}$, and $e : (1, \ldots, n-1)$. Let $\pi$ be
the permutation that sends $\{a, b, c, d, e\}$ to $\{e, c, b, a, d\}$, let $\mathcal{D}_1 = U_m(a, b, c, d, e)$,
and $\mathcal{D}_2 = U_n(e, c, b, a, d)$. The transitions in $\mathcal{D}_1$ and $\mathcal{D}_2$ are:

$\mathcal{D}_1$                      $\mathcal{D}_2$

$a : (0, \ldots, m-1)$               $\pi(a) : 1_{Q_2}$
$b : (0, 1)$                         $\pi(b) : \binom{n-1}{0}$
$c : \binom{m-1}{0}$                 $\pi(c) : (0, 1)$
$d : 1_{Q_1}$                        $\pi(d) : (1, \ldots, n-1)$
$e : (1, \ldots, m-1)$               $\pi(e) : (0, \ldots, n-1)$

Note that $U_n(a, b, c, d, e)$ is an extension of $U_n(a, b, c, d)$ to 5 letters.

Conjecture 1 ($(K_m \cap L_n)^*$). Let $K_m = U_m(a, b, c, d, e)$ and $L_n = U_n(e, c, b, a, d)$.
Then the complexity of $(K_m \cap L_n)^*$ is $2^{mn-1} + 2^{mn-2}$ for $m, n \ge 3$.

This has been verified for $m = 3$ and $n = 3, 4, 5, 6$ and for $m = 4$ and $n = 4, 5$.
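A check of this kind can be carried out mechanically: build the product automaton for the intersection, apply the subset construction for star, and minimize. The sketch below is one possible way to do this and is not the authors' program. It assumes that $c$ is the transformation sending $n-1$ to $0$ and fixing every other state, and that the star automaton adds a new final initial state $s$ together with an $\varepsilon$-move from the final state back to the original initial state. All function names are hypothetical.

```python
def universal_witness(n):
    """Transformations of U_n(a, b, c, d, e) on states 0..n-1, in role order (n >= 3)."""
    cycle = tuple((q + 1) % n for q in range(n))                # (0, ..., n-1)
    swap = tuple({0: 1, 1: 0}.get(q, q) for q in range(n))      # (0, 1)
    collapse = tuple(0 if q == n - 1 else q for q in range(n))  # n-1 -> 0 (assumed reading)
    identity = tuple(range(n))                                  # identity on Q
    inner = tuple(0 if q == 0 else q % (n - 1) + 1 for q in range(n))  # (1, ..., n-1)
    return cycle, swap, collapse, identity, inner


def star_of_intersection_complexity(m, n):
    """Number of states of the minimal DFA of (K_m ∩ L_n)*, found by brute force."""
    k, l = universal_witness(m), universal_witness(n)
    # Each letter acts as (transformation in D_1, transformation in D_2).
    # D_1 = U_m(a,b,c,d,e) and D_2 = U_n(e,c,b,a,d); letters in order a, b, c, d, e.
    letters = [(k[0], l[3]), (k[1], l[2]), (k[2], l[1]), (k[3], l[4]), (k[4], l[0])]
    start, final, new = (0, 0), (m - 1, n - 1), "s"

    def step(subset, t1, t2):
        image = set()
        for state in subset:
            p, q = start if state == new else state  # s behaves like the start state
            image.add((t1[p], t2[q]))
        if final in image:                           # epsilon-move back to the start
            image.add(start)
        return frozenset(image)

    # Subset construction restricted to reachable subsets.
    subsets = [frozenset([new])]
    index = {subsets[0]: 0}
    table = []
    for subset in subsets:                           # the list grows while we iterate
        row = []
        for t1, t2 in letters:
            target = step(subset, t1, t2)
            if target not in index:
                index[target] = len(subsets)
                subsets.append(target)
            row.append(index[target])
        table.append(row)

    # Moore-style partition refinement merges indistinguishable subsets.
    classes = [int(new in x or final in x) for x in subsets]
    while True:
        seen = {}
        refined = [seen.setdefault((classes[i], tuple(classes[j] for j in row)), len(seen))
                   for i, row in enumerate(table)]
        if len(seen) == len(set(classes)):
            return len(seen)
        classes = refined


def conjectured_bound(m, n):
    return 2 ** (m * n - 1) + 2 ** (m * n - 2)


for m, n in [(3, 3), (3, 4)]:
    print(m, n, star_of_intersection_complexity(m, n), conjectured_bound(m, n))
```

The computed value can never exceed the bound; whether it equals the bound for a given pair depends on the witness definitions assumed above matching those intended in the conjecture. Running time grows roughly like $2^{mn}$, so only small $m$ and $n$ are practical.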



5.4 The Language (K \ L)*

Theorem 8 ($(K_m \setminus L_n)^*$). The complexity of the operation $(K_m \setminus L_n)^*$ is
$2^{mn-1} + 2^{mn-2}$ for $m, n \ge 3$, and it is met by the witnesses $K_m$ and $\overline{L_n}$, where
$K_m$ and $L_n$ are the witnesses of Jirásková and Okhotin for intersection.

Proof. This follows since $(K \setminus L)^* = (K \cap \overline{L})^*$. □
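The identity behind this proof can be illustrated on product automata. Complementing a complete DFA only swaps its final and non-final states, so a DFA for $K \setminus L$ and a DFA for $K \cap \overline{L}$ share the same product transition structure and differ only in how the final states are described. A minimal sketch, with hypothetical names and arbitrary small sizes chosen for illustration:

```python
from itertools import product


def product_final_states(m, n, final_k, final_l, combine):
    """Final states of the product DFA on pairs (p, q) for a boolean operation."""
    return {(p, q) for p, q in product(range(m), range(n))
            if combine(p in final_k, q in final_l)}


m, n = 3, 4
final_k, final_l = {2}, {3}
complement_final_l = set(range(n)) - final_l   # final states of the DFA for the complement of L

difference = product_final_states(
    m, n, final_k, final_l, lambda in_k, in_l: in_k and not in_l)
intersection_with_complement = product_final_states(
    m, n, final_k, complement_final_l, lambda in_k, in_l: in_k and in_l)

# Same transitions, same final states: the two product DFAs accept the same language.
assert difference == intersection_with_complement
```

Since complementation preserves the number of states of a complete DFA, any bound for the star of an intersection carries over to the star of a difference.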

If Conjecture 1 holds, then we also have

Conjecture 2 ($(K_m \setminus L_n)^*$). Let $K_m = U_m(a, b, c, d, e)$ and $L_n = \overline{U_n(e, c, b, a, d)}$.
Then the complexity of $(K_m \setminus L_n)^*$ is $2^{mn-1} + 2^{mn-2}$ for $m, n \ge 3$.

5.5 The Language (K ⊕ L)*

The complexity of this combined operation remains open. 



6 Conclusions 



We have proved that the universal witnesses $U_n(a, b, c)$ and $U_n(a, b, c, d)$, along
with their permutational equivalents $U_n(b, a, c)$ and $U_n(d, c, b, a)$, and dialects
$U_{\{0\},n}(a, b, c)$, $T_n(a, b, c)$, $T_n(b, a, c)$, $W_n(a, b, c, d)$, $W_{\{0\},n}(a, b, c, d)$, $W_n(d, c, b, a)$
suffice to act as witnesses for all state complexity bounds involving binary
boolean operations and product combined with star. In the case of one or two
starred arguments, we have shown that it is efficient to consider all four boolean
operations together. The use of universal witnesses and their dialects simplified
several proofs, and allowed us to utilize the similarities in the witnesses.

Acknowledgment We thank Baiyu Li for careful proofreading and correcting 
several flaws in an earlier version of the paper. 